Method and system for conducting online marketing research in a controlled manner

ABSTRACT

The invention describes a method and system for conducting online marketing research while taking into account the budget specified for the experiment. The invention describes a methodology for effective data collection and optimised utilisation of the budget through efficient sampling and grouping of users.

FIELD OF THE INVENTION

[0001] The present invention relates to the field of e-commerce. More particularly, the invention pertains to the field of conducting online marketing research on a pre-defined user-base and with a specified budget.

BACKGROUND OF THE INVENTION

[0002] Marketing research enables a merchant to collect information about a product, the market, the competition and the perceptions of users.

[0003] The marketing mix is an organisation's overall offer or value to the customer. The basic marketing mix is often nicknamed “the 4Ps” (product, place/distribution, pricing, promotion); these are the elements in the marketer's armory, the aspects that can be manipulated to keep ahead of the competition. Thus a marketing research study refers to understanding and evaluating the impact of changing the existing marketing mix, i.e. the merchant's current offer to the customer, for a particular business objective. For example, the business objective may be to test a coupon promotion. The marketing research study will consist of offering different values of coupons to a small subset of the users in the target user segment. The user response is measured in terms of the purchase of the item being discounted for different discount values.

[0004] Some examples of business objectives for conducting marketing research are:

[0005] 1. Advertisement decisions

[0006] The merchant wants to decide on the effectiveness of some advertisements either across the user population or within a particular segment of the population. Users are selected from the specified segment(s) and are shown the advertisements a fixed number of times. The response is used to determine the effectiveness of each advertisement.

[0007] Advertisement collateral decisions: The merchant may use image-based recommendations of the different products. Each image highlights a particular attribute or feature of the product and the response to each of the images may be used to target the advertisements to a larger population.

[0008] 2. Pricing studies

[0009] The merchant is contemplating a change in price for brand X, priced currently at $50. Different prices may be offered by offering coupons, discounts or free goods to customers. Let the merchant decide on coupons as the mechanism for offering lower prices. Coupons of value $2, $5, $10 are offered to different users and the response of each promotion is compared to determine the price sensitivity of each segment. Users who did not respond to coupons of $2 and $5 are again offered coupons of higher value, $10. The merchant estimates the expected sales at different price points to arrive at the new price for the product. A number of coupon values are offered to different users (same as pre-testing a coupon promotion) in the same user segment and the response of each promotion is compared to determine the price sensitivity of each segment and, finally, to arrive at an optimum discount to be offered to the users.

[0010] 3. Competitive product studies:

[0011] The merchant wants to measure relative cannibalization of brands B and C when a discount is offered on brand A. Brand switching studies can be conducted by offering incentives on brand A to a small set of users of brands B and C. The choice made by the users and relative profitability of brand B and brand C enables the merchant to decide the discount.

[0012] Product switching studies can be conducted by offering incentives to users who prefer a particular brand. Based on the last product purchased by the users, users may be selected and offered coupons on the competing product with similar features. The decision of the user to use the coupon to switch from his preferred product to the competing product informs the merchant about the extent of loyalty in each segment and the impact of discounts.

[0013] Online marketing-research studies can be conducted by changing the marketing mix variables and studying the impact of the change on a response variable on a user-base that is present online. Various mechanisms may be used for conducting the online marketing research experiments such as online surveys and group discussions for explicit or implicit feedback on marketing-mix changes.

[0014] To ensure that the results of the online marketing research are not influenced by factors other than the marketing mix variables being studied, it is important that the study is conducted in a controlled manner. A controlled experiment refers to the scenario where there are two sets of users, an experimental set on which the experiment is conducted and a controlled set that is subjected to exactly the same conditions as the experimental set except that they do not take part in the experiment. The idea is to compare the effect of a single variable while all other variables are fixed. In the domain of marketing mix variation experiments, it requires offering the new set of marketing mix values to a user and comparing the response with another user who has the same profile as the first user, but is not subjected to the new set of marketing mix values. For example, in the coupon promotion study mentioned above, if a user has already purchased the item being discounted, her response will be different from that of a user who has not purchased the item in the past. So it is important to match the response of the users who are offered the coupon against those who are not offered the coupon, such that both sets of users had either purchased the item in the past or not purchased it in the past. Similarly, the merchant may offer two different surveys to the controlled group and the experimental group, to study and remove the bias introduced by the survey methodology. The surveys might differ in sequence of questions, sequence of response options in questions and/or the text description/images displayed for the questions.

[0015] Any marketing research experiment also has an associated budget, one or more target user segments and a time period for completion. The merchant desires to complete the experiment within the specified budget, within the defined time period and obtain the required information from one or more sets of target users/customers. For example, in the coupon promotion study mentioned above, the budget could be in terms of the total discount offered to the users or the merchant may want to finish the study before the start of the Christmas season or may wish to conduct the study only on loyal customers.

[0016] The present systems for online as well as offline marketing research do not provide effective or efficient sampling of users. For instance, users having different profiles are subjected to the experiment, without first determining whether it would result in a substantial gain in the data collection. The distribution of users into different experimental and control groups is not based on their profiles and therefore the grouping is ineffective. The selection of the experiment to be administered to a user group should preferably be based on the group profile. Since existing groups consist of heterogeneous profiles, such effective implementation is not feasible. Such ineffective grouping of users also prevents the comparative study of results obtained from different groups of users.

[0017] The present systems are incapable of performing meaningful sampling in situations where a predefined user base is not available. For instance, first time users randomly arriving at a website in an online marketing scenario would not be processed or grouped in any meaningful manner.

[0018] Existing systems allocate the budget in a static manner and do not take into account situations where a deterministic estimate of the budget consumed at a given point in time is not available. For instance, if coupons are issued as an incentive for participation or as a part of the market research study, the redemption rate of coupons is not known deterministically. If the system assumes that all the coupons issued would be redeemed, the system would overestimate the budget consumed. The expenditure of budget depends on the experiment being administered. Present systems do not dynamically adjust budget allocation amongst different groups and users.

[0019] U.S. Pat. No. 6,112,186 discloses a method for calculating the similarity between users and taking intelligent decisions thereof. The invention uses collaborative filtering for calculating user similarity. But the similarity thus obtained is not used to maintain a control group for comparing the result of the experiment.

[0020] U.S. patent application Publication No. 2002/0042738 A1 discloses an invention that uses pricing and cost function on a case to case basis. A cost based analysis approach is used to determine whether or not the experiment should be administered. However the invention does not talk about smart expenditure of the budget specified.

[0021] U.S. patent application Publication No. 2002/0035568 A1 defines user profiling and campaign editing. The invention does not talk about using the user profiles for keeping a control group for the experiment and profile based administration of the experiment.

SUMMARY OF THE INVENTION

[0022] The object of this invention is to provide effective and efficient implementation of market research experiments by enabling optimal selection and grouping of users for administering the experiments.

[0023] A second object of the invention is to enable optimal deployment and utilization of the budget for the market research experiment activity.

[0024] To achieve these and other objectives and to overcome the limitations of the existing systems, this invention proposes identifying experiment parameters, determining target ideal representative profiles of users to whom the experiment should be administered, comparing each arriving user's profile with the corresponding representative profile and selecting the user for the experiment and assigning the user to an available experiment group or control group based on the match, prior to administering the experiment to a selected user.

[0025] The ideal representative user profile may already be available or may be generated dynamically based on the profiles of the users arriving at the site.

[0026] The invention also describes a system and a program product implementation of the said method.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative preferred embodiment when read in conjunction with the accompanying drawings, wherein:

[0028] FIG. 1 shows a basic network structure comprising the servers conducting the experiment and the clients participating in them.

[0029] FIG. 2 depicts the internal structure of a generic computing system on which the invention might be practiced by using them as a server as well as a client.

[0030] FIG. 3 shows the various components of the system.

[0031] FIG. 4 shows the flowchart for the offline process.

[0032] FIG. 5 is a flowchart for the online process.

[0033] FIG. 6 is a flowchart for the experiment parameter specification process.

[0034] FIG. 7 depicts the flow of process involved while specifying the smart sampling parameters.

[0035] FIG. 8 shows the offline and online components of the smart sampling subsystem.

[0036] FIG. 9 illustrates the matching subsystem.

DETAILED DESCRIPTION

[0037] FIG. 1 shows a general client-server network on which the invention might be practiced. It consists of one or more servers (1.6) connected to a network (1.1). The scope of the network covers all types of networks such as Local Area Network, Internet and the like. Also present on the network are different clients (1.2, 1.3, 1.4, 1.5). The interconnection between different clients on the network is by any known communication means such as wired links, radio links or infrared transmissions. The networking topology covers all known topologies such as star, linear, ring or a combination of any of these. The clients and the server communicate using any of the known communication protocols such as TCP/IP or Ethernet. The number of servers and clients is not limited and the data could reside either on one server or could be distributed over a number of servers. The server (1.6) acts as the information store and clients are seekers of information sending requests to the server for information items contained therein.

[0038] The clients (1.2, 1.3, 1.4, 1.5) may be electronic devices such as personal computers, mobile phones, interactive televisions and the like, operated by humans or software agents operating on behalf of individuals or organizations.

[0039] FIG. 2 shows a block diagram of a general computing system (2.1) on which the invention might be practiced. The computer system (2.1) consists of various subsystems interconnected with the help of a system bus (2.2). The microprocessor (2.3) communicates with and controls the functioning of other subsystems. The microprocessor (2.3) also acts as the control unit operating in conjunction with memory (2.4) to perform operations as defined by the stored instructions. In a general computer system the control module is any commercially available processor of which x86 processors from Intel and 680X0 series from Motorola are examples. The computing system could be a single processor system or may use two or more processors on a single system or over a network. This control module also controls the functioning of the other components of the computing system (not shown). Control module (2.3) accesses said memory (2.4) through system bus (2.2) that interconnects the various parts of the computing device. The control module executes a program called the operating system for the basic functioning of the computer system. Some examples of operating systems are UNIX, WINDOWS and DOS. These operating systems allocate the computer system resources to various programs and facilitate the interaction of users with the system. Memory (2.4) supports the microprocessor in its functioning by storing instructions and data required for program execution. Examples of memory are random access memory devices such as dynamic random access memory (DRAM) or static random access memory (SRAM). Storage Device (2.5) is used to hold the data and instructions permanent in nature such as the operating system and other programs. Video Interface (2.6) is used as an interface between the system bus and the display device (2.7), which is generally a video display unit such as a monitor. The network interface (2.8) is used to connect the computer with other computers on a network which can be either a Local Area Network (LAN) or a Wide Area Network (WAN) or any other type of computer network, through wired or wireless means. This networking interface can also be used to connect to the Internet. The computer system might also contain a sound card (2.9). The system is connected to various input devices like keyboard (2.11) and mouse (2.12) and output devices like printer (2.13), through an input/output Interface (2.10). Various configurations of these subsystems are possible. It should also be noted that a system implementing the present invention might use fewer or more subsystems than described above.

[0040] In the preferred embodiment of the invention, the instructions are stored on the storage device (2.5) in the form of a computer program. This program contains coded instructions for the different algorithms described in this specification. On running the program, the instructions are transferred to the memory (2.4) and the microprocessor (2.3) executes the instructions. The system can be manually controlled by giving instructions through input devices such as keyboard (2.11) and mouse (2.12). Instructions, whether from the program or from the user input, reside in the memory (2.4) and are subsequently acted upon by the microprocessor (2.3). It should be understood that the invention is not limited to any particular hardware comprising the computer system or the software running on it.

[0041] Those of ordinary skill in the art will appreciate that the various means for generating service requests by the clients and their processing by the server are computer programs. These programs are capable of existing in an embedded form within the hardware of the system or may be embodied on various computer readable media. The computer readable media may take the form of coded formats that are decoded for actual use in a particular information processing system. Computer program means or a computer program in the present context means any expression, in any language, code, or notation, of a set of instructions intended to cause a system having information processing capability to perform the particular function either directly or after performing either or both of the following:

[0042] a) conversion to another language, code or notation

[0043] b) reproduction in a different material form.

[0044] The depicted example in FIG. 2 is not meant to imply architectural limitations and the configuration of the incorporating device of the said means may vary depending on the implementation. Any kind of computer system or other apparatus adapted for carrying out the means described herein can be employed for practicing the invention. A typical combination of hardware and software is a general purpose computer system with a computer program that, when loaded and executed, controls the computer system such that it carries out the means described herein. Other examples of the incorporating device that may be used are notebook computers or hand held computers, PDAs etc.

[0045] The proposed system comprises an offline and an online component. The offline component doesn't require the user's presence on the web site. The first step in the offline process is the specification of the experiment parameters by the merchant. The parameters comprise, inter alia, the user selection criteria, the experiment details such as the marketing variables to be tested or the survey to be administered and the user response variables to be measured, the time period for the experiment, the smart sampling parameters, the matching parameters and the experiment budget. Next the system determines if any part of the specified user selection criteria can be processed in the offline mode, for example, checking a registered user's past traversal history, past transactions and demographic information. It checks the existing user profiles against the criteria and marks the users that satisfy the offline portion of the user selection criteria. When a user visits the web site, the system may or may not be able to associate the user with any of her previous visits. The possible ways in which a system can associate the user with her previous visits are by login, cookies on the user's system etc. If the user can be associated with her previous visits, then she must satisfy the offline selection criteria to be evaluated further by the online component. Otherwise, the user is checked only against the online selection criteria, e.g., the user clickstream during the session, and other user activities like shopping, browsing, chatting. The merchant can configure the user parameters, distance measures, comparison techniques and selection rules.

[0046] When a user visits the web site, the system determines if the budget allows an online experiment to be conducted. If the system decides that an experiment can be conducted on the user, the budget available flag is set to true. This is required to conduct the experiment within the specified budget. As the user browses the web site, certain user actions may trigger an event to notify the system that the user satisfies the online user selection criteria. This allows the merchant to target the experiment only to the desired user segment. Note that this event is not triggered if the budget available flag is false. In response to the event, the system determines if the user is indeed a good candidate for conducting the experiment. The measure of ‘goodness’ can be defined in terms of the possible information gain from the user, given the current state of the system. This smart sampling tries to gain the maximum information from a minimum number of users so as to minimize the cost and the total time required to complete the experiment. If the user is identified as a good candidate for offering the experiment, the system checks if the user matches a previous participant. In case the current user matches a previous participant, the user may be classified to belong to one of the experimental group(s) or the controlled group to which a matched user does not belong. For illustration, let there be 3 experimental groups, A, B, C and 1 controlled group, D in the marketing research study. If a user X matches a previous user Y who belongs to experimental group B, the user X may be assigned to either A, C or D. Let X be assigned to D. Now, if another user Z matches X and Y in terms of user features, the system may assign Z to either of the remaining matched groups, i.e., either A or C.
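
The group assignment in the illustration above can be sketched as follows. This is a minimal sketch, not the claimed implementation: the names `vacant_groups`, `assign_group` and the `preferred` argument are hypothetical, and the rule of falling back to the first vacant group is an assumption, since the description only requires that a matched user be placed in a group that no other member of the matched set occupies.

```python
from typing import Dict, List, Optional

# Hypothetical group labels taken from the illustration: three
# experimental groups (A, B, C) and one controlled group (D).
GROUPS: List[str] = ["A", "B", "C", "D"]


def vacant_groups(matched_set: Dict[str, str], groups: List[str]) -> List[str]:
    """Return the groups not yet occupied by any user in the matched set."""
    taken = set(matched_set.values())
    return [g for g in groups if g not in taken]


def assign_group(
    matched_set: Dict[str, str],
    groups: List[str],
    user_id: str,
    preferred: Optional[str] = None,
) -> Optional[str]:
    """Place user_id in a vacant group of its matched set.

    Uses `preferred` when it is still vacant, otherwise the first vacant
    group (an assumed tie-breaking rule). Returns None when every group
    of the matched set is already occupied.
    """
    vacant = vacant_groups(matched_set, groups)
    if not vacant:
        return None
    group = preferred if preferred in vacant else vacant[0]
    matched_set[user_id] = group
    return group


# Matched set from the illustration: Y already belongs to group B.
matched = {"Y": "B"}
x_group = assign_group(matched, GROUPS, "X", preferred="D")  # X goes to D
z_options = vacant_groups(matched, GROUPS)                   # A or C remain for Z
```

With Y in B and X placed in D, the only groups left for a further matched user Z are A and C, which agrees with the illustration.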

[0047] If the user is assigned to the controlled group, no experiment may be offered to the user but the user response is measured. Alternatively, if the merchant has defined an experiment for the users belonging to the controlled group, that experiment might be offered to the user and response collected. In case the user is assigned to one of the experiment group(s), the experiment specific to the experimental group may be offered to the user and response collected.

[0048] If the user is classified to belong in the experimental group, the experiment is offered to the user and the response is recorded. The experiment offering and the response measurement are based on the merchant specified marketing variable and user response variable parameters. The last step makes sure that the experiment is conducted in a controlled fashion.

[0049] FIG. 3 shows the various components of the system. The offline part (3.1) comprises an experiment specification tool (3.2), an (optional) offline qualification subsystem (3.3) and an (optional) offline smart sampling subsystem (3.4). The online component (3.5) comprises a budget subsystem (3.6), an online qualification subsystem (3.7), a smart sampling subsystem (3.8), a matching subsystem (3.9), an online storage medium (3.10) and an experiment engine (3.11). The data analysis and reporting tool (3.12) can be offline as well as online. The offline qualification subsystem and the smart sampling subsystem may do computation intensive operations on the user's past transaction data, traversals and other user features. To avoid any performance hits during the online process, these computations may be done from an offline storage medium (3.13). The offline storage medium can communicate with the online storage medium if required. All the said components communicate with each other by means of the network as explained in FIG. 1.

[0050] FIG. 4 shows a flowchart for the offline process. The first step is the specification of the experiment parameters by the merchant (4.1). The parameters comprise, inter alia, the user selection criteria, the marketing variable to be tested, the user response variable to be measured, the time period for the experiment and the experiment budget. Thereafter it is checked whether any offline qualification criteria are specified (4.2). If yes, then the next step is to determine if any existing users meet the user selection criteria (4.3). The user selection criteria may comprise one or more offline criteria based on the user's past purchases, demographics, past clickstreams, etc. and/or one or more online criteria based on the user's clickstream. The offline qualification subsystem (of FIG. 3) checks the existing user profiles stored in the offline storage medium (4.4) against the offline criteria and marks the existing users (4.5) that satisfy the offline portion of the user selection criteria. When a user visits the web site, the online qualification subsystem (of FIG. 3) checks if the user meets the online selection criterion (4.7). The system may or may not be able to associate the user with any of her previous visits. The possible ways in which a system can associate the user with her previous visits are by login, cookies on the user's system etc. If the user can be associated with her previous visits, then she must satisfy the offline selection criteria (4.3) to be evaluated further by the online process. Otherwise (4.6), the user is checked only against the online selection criteria (4.7), e.g., the user clickstream during the session. If no offline criterion is specified, no offline qualification is done and all the users are checked only against the online process.

[0051] The offline smart sampling subsystem, if enabled (4.8), selects the ideal representative profiles that provide the maximum expected information gain (4.9). The merchant may do the selection of ideal representative profiles in batches. The selection of ideal representative profiles is done for each batch. The online smart sampling subsystem compares the features of online users with the stored features of the ideal representative profiles and if the distance of the current online user is within a threshold distance from the ideal representative profiles for selection, the experiment may be offered to the current user and response stored. Once the offline subsystem gets the response to all the ideal representative profiles, or a specified time limit is exceeded (4.10), it incorporates these responses in the sampling for the next batch. This allows the system to dynamically choose the ideal representative profiles. Once all the batches have been evaluated, the merchant is informed about the completion of the experiment.

[0052] FIG. 5 shows a flowchart of the online process. As a user visits the web site (5.1), the user may be identified as an existing user (5.2). In case the user is identified as an existing user (5.3), the system checks if the offline qualification satisfaction flag is true or not (5.4). In case the flag is set to ‘No’ then the process stops (5.5). The remaining process is followed only for the user whose flag is true; otherwise the user just browses the site normally. In case the user is a new user or has not yet been identified as an existing user, the remaining process is followed for the user. The budget subsystem determines if budget is available to conduct an online experiment on the user (5.6). If the budget subsystem decides that an experiment can be conducted for the user, the budget available flag is set to true. As the user browses the web site, certain user actions may trigger an event to notify the system that the user satisfies the online user selection criteria for offering the experiment (5.8). This event is not triggered if the budget is not available (5.7). The system responds to the online criteria satisfaction event by invoking a query on the smart sampling subsystem. The purpose of this query is to determine if the user is indeed a good candidate for conducting the experiment (5.9). The measure of ‘goodness’ can be defined in terms of the possible information gain from the user, given the current state of the system. If the user is identified as a good candidate for offering the experiment, and the smart sampling flag is true (5.10), the matching subsystem determines the user's experimental group. The matching subsystem checks if the user matches a previous participant with a vacant matched group (5.11). If the user doesn't match a previous participant, a new profile ID for the new matched set of users is created with the user features of the current user (5.14). The user may be classified to one of the vacant experimental group(s) or the controlled group (5.12) and accordingly the experiment is offered or not offered to the user (5.13) and the user response is measured (5.15). The experiment engine (of FIG. 3) conducts the experiment on the selected participant by changing the marketing mix variables in the manner determined by the matching subsystem or by offering the multiple surveys associated with each of the matched groups, and measures the response.
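
The sequence of checks in the online process can be summarised in a short sketch. It is only an illustration under stated assumptions: the `Outcome` labels, the function name `online_decision` and the reduction of each subsystem to a single boolean input are hypothetical simplifications, and the step numbers in the comments refer to FIG. 5 as described above.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for where the online process ends for a visit."""
    STOP = "stop"                    # existing user failed offline qualification (5.5)
    NO_BUDGET = "no_budget"          # budget not available, event not triggered (5.7)
    NOT_QUALIFIED = "not_qualified"  # online selection criteria not satisfied
    NOT_SAMPLED = "not_sampled"      # smart sampling did not select the user
    ASSIGN_GROUP = "assign_group"    # hand over to the matching subsystem (5.11)


def online_decision(
    existing_user: bool,
    offline_flag: bool,
    budget_available: bool,
    online_criteria_met: bool,
    good_candidate: bool,
) -> Outcome:
    """Apply the checks of the online process in the order described."""
    # Steps 5.2 to 5.5: an identified existing user must have passed
    # offline qualification; new or unidentified users skip this check.
    if existing_user and not offline_flag:
        return Outcome.STOP
    # Steps 5.6 and 5.7: no event is raised when the budget is exhausted.
    if not budget_available:
        return Outcome.NO_BUDGET
    # Step 5.8: the user's actions must satisfy the online criteria.
    if not online_criteria_met:
        return Outcome.NOT_QUALIFIED
    # Steps 5.9 and 5.10: the smart sampling query must accept the user.
    if not good_candidate:
        return Outcome.NOT_SAMPLED
    return Outcome.ASSIGN_GROUP
```

In a real deployment each boolean would be supplied by the corresponding subsystem; they are passed in directly here so that the ordering of the checks is easy to follow.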

[0053] The response variables to be collected are provided by the experiment specification tool.

[0054] The data is compared with historical information (for historic comparison studies); else the statistics for response variables across the controlled group and the experiment groups are reported through a GUI.

[0055] The various subsystems involved above are explained below in detail:

[0056] Experiment parameter specification tool

[0057] FIG. 6 shows the details of the experiment parameter specification process. The experiment specification tool allows the merchant (6.1) to describe the business objective for which the experiment is being conducted. The order of specification of parameters may or may not be the same as shown in FIG. 6. FIG. 6 shows one instance of the process.

[0058] There could be a predefined list of objectives that the merchant can select from or the merchant can specify a new objective (6.2), which would be added to the list of existing objectives. If the merchant chooses to specify a new objective (6.2), the merchant has to first name the experiment (6.3) and specify various parameters such as the user selection criteria, the experiment details such as the marketing variable to be tested and the user response variable to be measured, the time period for the experiment, the smart sampling parameters, the matching parameters and the experiment budget. For different types of experiments, the merchant needs to provide specific details for the experiment. For example, for pricing experiments, the merchant provides the coupon promotion details including values of the coupons to be offered, their expiry time, redemption conditions etc. For catalog reordering experiments, the merchant provides the category page on which products have to be reordered and the sequence in which they shall be displayed. For explicit experiments or surveys, the merchant provides the set of questions to be asked, the choice types and the labeling of the response choices, the sequence in which questions shall be asked or the sequence in which the response choices shall be displayed. A set of predefined surveys with templates can also be offered wherein the merchant provides the parameters specific to the templates.

[0059] If the merchant wishes to specify a particular user segment (6.4), the user selection criteria have to be specified (6.5), which may comprise offline and/or online criteria. The user selection criteria can be based on the user's past purchases, user's registration information (including demographics, interests, other personal information volunteered at the time of registration), user's current session clickstream, user's previous session's clickstream history, past usage of coupons, response to advertisements, product recommendations, and history of merchant defined events triggered during previous visits. The merchant may have defined a set of events, and whether those events occurred during the user's previous visits could be stored as part of the user's profile. The merchant specifies which one (or which combination) of these criteria she/he wants to use for user selection and then defines conditions on each of these criteria. The offline criteria may be used to preselect a subset of registered users.

[0060] The matching parameters refer to the data required by the matching subsystem (6.6). The merchant selects the user features (6.7) that are to be used for comparing two users. A set of predefined distance measures is available for merchant selection. The merchant also selects a threshold on this distance within which two users can be considered to be in close proximity to each other (6.8).
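
A threshold-based match of the kind described above might look like the following sketch. The Euclidean distance is only an assumed example, since the description does not name the predefined distance measures, and the names `euclidean`, `find_match` and the profile identifiers are hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Sequence

Features = Sequence[float]


def euclidean(a: Features, b: Features) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_match(
    user_features: Features,
    profiles: Dict[str, Features],
    threshold: float,
    distance: Callable[[Features, Features], float] = euclidean,
) -> Optional[str]:
    """Return the ID of the closest stored profile within `threshold`.

    Returns None when no stored profile is close enough, in which case
    the caller would start a new matched set for this user.
    """
    best_id: Optional[str] = None
    best_distance: Optional[float] = None
    for profile_id, features in profiles.items():
        d = distance(user_features, features)
        if d <= threshold and (best_distance is None or d < best_distance):
            best_id, best_distance = profile_id, d
    return best_id


# Hypothetical stored profiles over two merchant-selected features.
stored = {"p1": (0.0, 0.0), "p2": (3.0, 4.0)}
near = find_match((0.0, 1.0), stored, threshold=1.5)    # close to p1
far = find_match((10.0, 10.0), stored, threshold=1.5)   # no profile in range
```

Passing the distance function as a parameter mirrors the idea that the merchant chooses among several measures; any other measure with the same signature could be substituted.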

[0061] The merchant specifies whether to use smart sampling or random sampling (6.9). For random sampling, the merchant specifies the sampling percentage. If the merchant chooses smart sampling (explained in FIG. 7) instead of random sampling, additional parameters related to smart sampling are sought from the merchant or an outside agent (6.10).

[0062] The merchant specifies the budget in terms of a monetary value or as a limit on the number of participants (6.11). The merchant may also enable smart budget computation (6.13). This is useful because, in an experiment in which the user is offered an object which promises a certain value of benefit in the future (6.12), the remaining budget available for utilization is not deterministic. In this invention, we propose the use of prediction tools to obtain a near accurate, real-time estimate of the remaining budget. This estimate determines the sample size of users to which the experiment should be offered, based on the past user responses.

[0063] The merchant further specifies the experiment time period (6.14), which can be specified as a start date/time and an end date/time. This is used by the various online process components to ensure that the experiment is conducted within the time period.

[0064] Depending on the experiment, the merchant can specify (select) a plurality of marketing variables (6.15) such as the price of a product, the discount to be offered on a category, a new service advertisement, a product bundle to be tested, product design or packaging. The business objective of the experiment thus specified is to study the impact of a change in the marketing variable(s) on the selected users with respect to the response variable(s), given the control parameters. A response variable is a variable that is observed or measured as the response to the experiment. Examples of response variables are revenue, product purchase, user brand switching, ad click through rate, coupon acceptance rate etc.

[0065] The offline qualification subsystem

[0066] The offline qualification subsystem preselects a set of users who meet the offline user selection criteria/criterion as specified by the merchant. The offline qualification criterion can be defined based on registered users' information. Each user for whom features are stored in the offline storage medium is checked against the offline selection criterion. The identifiers of the qualifying users and the relevant user information are stored in the online storage medium for online application.

[0067] Online Storage Medium

[0068] The online storage medium may be in the form of rules, database tables, XML or text files. If rules are the storage mechanism, the rules are applied through an online rule engine. Otherwise each tool may use online code fragments, for example applets and servlets, to retrieve information from the online storage medium, process it and store the outcome back to the storage medium. Alternatively, the tools may pass text, HTML, code or XML files containing the required information to each other, and this information may be stored in the online memory.

[0069] Budget subsystem

[0070] The role of the budget subsystem is to decide whether the available budget permits offering the experiment to a user visiting the web site, before any further qualification checks are done on the user. Each experiment has an associated cost, and this cost is compared with the available budget; if the cost is less than the budget, the budget available flag is set to TRUE. This may involve a simple check on budget availability or a more complex check. A complex check is required (if smart sampling is enabled as shown in FIG. 6) where the cost of the experiment depends on the user's response subsequent to the experiment, e.g., if the experiment comprises offering an e-coupon to the user, the cost depends on whether the user accepts the coupon and subsequently redeems it. The budget availability determination is then not straightforward, as it is hard to determine whether the current user will eventually redeem the coupon, even though the system knows whether the last user who accepted the coupon redeemed it. In such cases the budget subsystem can learn from past user actions and make intelligent predictions from them for the current user. A number of reinforcement and supervised learning and prediction algorithms, such as linear regression, neural networks and decision trees, can be used to predict the final redemption rate and hence the budget utilization. The prediction algorithms may use user specific features including, but not limited to, demographics, past purchases, psychographics, clickstream, survey responses, advertisement and coupon response history and other derived or related features to learn the redemption rate.

[0071] In an experiment in which the user is offered an object which promises a certain value of benefit in the future, the remaining budget available for utilization is not deterministic. In this invention, we propose the use of prediction tools to obtain a near accurate, real-time estimate of the remaining budget. This estimate determines the sample size of users to which the experiment should be offered, based on the past user responses.
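As a non-limiting illustration, the estimate of the remaining budget for a coupon experiment can be sketched as follows. The sketch assumes a single coupon value and replaces the learning algorithms named above with a simple smoothed redemption rate computed from coupons whose outcome is already known; the class name `CouponBudget`, its fields and the prior parameters are hypothetical and are not part of the described system.

```python
from dataclasses import dataclass


@dataclass
class CouponBudget:
    """Hypothetical tracker for an experiment whose cost depends on future redemptions."""
    total_budget: float   # monetary budget specified by the merchant
    coupon_value: float   # cost incurred each time a coupon is redeemed
    accepted: int = 0     # coupons accepted so far
    redeemed: int = 0     # accepted coupons already redeemed
    expired: int = 0      # accepted coupons that expired unredeemed

    def redemption_rate(self, prior_rate: float = 0.5, prior_weight: float = 2.0) -> float:
        """Smoothed redemption rate over coupons whose outcome is known.

        The prior is an assumption that keeps the estimate defined before
        any coupon has been redeemed or has expired.
        """
        settled = self.redeemed + self.expired
        return (self.redeemed + prior_rate * prior_weight) / (settled + prior_weight)

    def expected_remaining(self) -> float:
        """Budget left after paid redemptions and expected future redemptions."""
        outstanding = self.accepted - self.redeemed - self.expired
        committed = self.redeemed * self.coupon_value
        expected = outstanding * self.redemption_rate() * self.coupon_value
        return self.total_budget - committed - expected

    def budget_available(self) -> bool:
        """Budget available flag: True if one more accepted coupon is expected to fit."""
        expected_cost_of_offer = self.redemption_rate() * self.coupon_value
        return self.expected_remaining() >= expected_cost_of_offer
```

In this sketch the flag is conservative: it treats the next offer as if it will be accepted, so the only uncertainty priced in is whether the coupon is redeemed.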

[0072] As explained in the matching subsystem, a user is assigned to one of the vacant experimental or control group(s). The cost of conducting an experiment can be different for each of the experimental or control group(s), depending on what experiment is conducted for that specific group. In such an instance, the budget available is checked after the current user has been assigned to the group.

[0073] The online qualification subsystem

[0074] The online qualification subsystem takes into account user features which are real-time or can be determined only in real-time, e.g., the user's click on a particular product or category or the user's addition of an item to the shopping cart. For new or unregistered users, that is, users who have not visited the site before or about whom no previously stored information is available, the qualification subsystem uses online criteria to select them. For users who can be associated with their previous history, the online qualification subsystem checks only the users that satisfy the requirement of the offline qualification subsystem. For example, the online qualification subsystem may select users who have traversed a particular path in their clickstream behavior, or those users who are within a certain distance of a prototypical clickstream that the merchant defines. The selected users may be a subset of the users selected by the offline qualification subsystem, or the outcome of the offline qualification subsystem may be re-computed in real-time based on the inputs received by the online qualification subsystem. The recomputation may be done because there may be a time gap between the offline qualification check and the online check, so a user who did not satisfy the criteria earlier may satisfy them now. The online qualification subsystem receives user specific information from the online storage medium and sets the FLAG to TRUE if the user meets the online qualification criteria.

[0075] Smart sampling subsystem

[0076] FIG. 7 is a flowchart that depicts the smart sampling parameter specification by the merchant or an outside agent (7.1). The merchant or the agent specifies the user features to compare (7.2). If the smart sampling is completely online (7.3), then the distance measure and threshold are calculated (7.4), followed by selection of the mode for batches, which can be single or multiple batches (7.5). Thereafter the batch size and the time allocated to each batch are selected (7.6), and finally the number of respondents desired for every representative profile is specified (7.7).

[0077] The smart sampling subsystem, as explained in detail in FIG. 8, tries to gain the maximum information from a minimum number of users so as to minimize the cost and the total time required to complete the experiment. While the budget subsystem ensures that the experiment is conducted within the merchant's specified budget and dynamically estimates the sample size of users to which the experiment should be offered based on the past user responses, the smart sampling subsystem optimizes the information gained from a sample of that size.

[0078] The merchant or the outside agent specifies the user features to compare (8.1). In one instance, the smart sampling subsystem consists of both online (8.3) and offline (8.2) components. The offline smart sampling subsystem selects the ideal representative profiles (as defined by the merchant in the experiment parameters) which provide the maximum expected information gain. The user features of these ideal representative profiles are stored in the online storage medium in the form of feature values.

[0079] The online smart sampling subsystem (8.3) compares the features of online users (8.7) with the stored features of the ideal representative profiles. It computes the distance of the current online user from all ideal representative profiles (8.8). If the current online user is within a threshold distance of one of the ideal representative profiles (8.9), the experiment may be offered to the current user and the online smart sampling subsystem sets its FLAG for the current user to TRUE (8.10); otherwise the FLAG is set to FALSE (8.11).
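As a non-limiting illustration, the online comparison against the stored ideal representative profiles can be sketched as follows. The sketch assumes numeric feature vectors and a Euclidean distance, which is only one possible choice from a set of predefined distance measures; the function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_profile(user: Sequence[float],
                    profiles: Sequence[Sequence[float]],
                    threshold: float) -> Optional[int]:
    """Index of the closest ideal representative profile within the
    threshold distance, or None if no profile is close enough."""
    best_index, best_distance = None, threshold
    for index, profile in enumerate(profiles):
        distance = euclidean(user, profile)
        if distance <= best_distance:
            best_index, best_distance = index, distance
    return best_index


def smart_sampling_flag(user: Sequence[float],
                        profiles: Sequence[Sequence[float]],
                        threshold: float) -> bool:
    """FLAG for the current user: True if the experiment may be offered."""
    return nearest_profile(user, profiles, threshold) is not None
```

For example, with profiles `[(0, 0), (10, 10)]` and a threshold of 2.0, a user at `(1, 1)` sets the flag while a user at `(5, 5)` does not.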

[0080] The offline smart sampling subsystem may use one of the well-known active learning algorithms. The user features defined or selected by the merchant (8.4) in the experiment parameters may serve as inputs to the active learning algorithm (8.5). The merchant may optionally also select a distance metric from a set of predefined distance metrics. The active learning algorithm may take further inputs in the form of the threshold distance, the expected arrival rate of users, a spatial distribution of users along the user feature space and the expected response rate of the individual users. The active learning algorithm may be based on a Bayesian framework and select the ideal representative profiles that will maximize the expected information gain (8.6).

[0081] The selection of ideal representative profiles may be done in batches. This means that first a set of ideal representative profiles is selected. The system then waits for the responses for these ideal representative profiles. Once the offline subsystem has received the responses for all the ideal representative profiles, or a specified time limit is exceeded, it incorporates these responses in the sampling for the next batch. This allows the system to choose the ideal representative profiles dynamically.

[0082] The online smart sampling subsystem maintains, for each representative profile, a count of the number of users from whom a response has already been collected (8.12, 8.13). Where controlled or experimental matched groups are present, a separate count is maintained for each of these groups. Once the number of respondents in each group corresponding to an ideal representative profile is equal to the desired number of points (8.14), the ideal representative profile is made INACTIVE and no user in close proximity to this ideal representative profile is selected thereafter (8.15). The online smart sampling subsystem selects the ACTIVE ideal representative profiles from the list of representative profiles and computes their distance from the current user. Once the specified time period has expired or all the representative profiles are INACTIVE, the system recomputes new ideal profiles (8.16).
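As a non-limiting illustration, the per-group respondent counts and the ACTIVE/INACTIVE state of a representative profile can be sketched as follows. The class `RepresentativeProfile`, the group names and the helper `needs_recompute` are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Sequence, Tuple


@dataclass
class RepresentativeProfile:
    """Hypothetical record for one ideal representative profile."""
    features: Tuple[float, ...]
    desired: int                                   # respondents wanted per group
    counts: Dict[str, int] = field(default_factory=dict)  # group -> respondents so far

    def record_response(self, group: str) -> None:
        """Count one more respondent for the given matched group."""
        self.counts[group] = self.counts.get(group, 0) + 1

    def is_active(self, groups: Iterable[str]) -> bool:
        """ACTIVE while at least one group still lacks the desired number of respondents."""
        return any(self.counts.get(g, 0) < self.desired for g in groups)


def needs_recompute(profiles: Sequence[RepresentativeProfile],
                    groups: Sequence[str],
                    time_expired: bool) -> bool:
    """New ideal profiles are computed once the time period has expired
    or every representative profile has become INACTIVE."""
    return time_expired or not any(p.is_active(groups) for p in profiles)
```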

[0083] Matching subsystem

[0084] The matching subsystem, as explained in FIG. 9, ensures that the experiment is conducted in a controlled and matched manner, i.e., for every user (9.1) who is offered the experiment in the experimental group, the behavior of a similar user in the controlled group is captured. The objective is to compare the effect of a single variable while all other variables are fixed. In addition, if the experiment requires more than one value of the marketing variable to be tested together, the matching subsystem also distributes the users across the different experimental groups corresponding to the different values. For each user who is offered one value of the marketing variable being tested, a similar user is offered the other value. For example, if the merchant wanted to test three different coupon discount values, there would be three experimental groups, one for each discount value, and one control group with no discount, for a total of four matched groups. Similarly, the merchant may offer multiple surveys to the controlled group and the experimental group(s). The experiment may also comprise multiple versions of the same survey, to study and remove the bias introduced by the survey methodology. The surveys might differ in the sequence of questions, the sequence of response options in questions and/or the text description/images displayed for the questions.

[0085] Once an experiment is offered to a user in the experimental group, a profile id is created (9.2). The participant's profile, the participant's group, i.e., experimental or controlled (in case there are multiple matched experimental groups, the specific experimental group id), and the total number of participants who have been offered the experiment in that group are recorded and stored in an online storage medium (9.3). The profile of the participant needs to be stored, rather than only the participant's user id, because the profile is not static. It may change as the participant browses the site further. To ensure a complete match, the profile has to be recorded at the time of offering the experiment. Also, a FLAG, referred to as MFLAG, is created for each profile id and is set to FALSE (9.4). An MFLAG of FALSE indicates that at least one of the other matched groups for this profile id still has a vacancy for a user similar to the user already sampled. Another user (9.1) who arrives at a later point in time is compared to the existing profiles whose MFLAG is FALSE (9.5). The distance computation may be restricted to the representative users belonging to each set of matched users (or to the ideal representative profiles if smart sampling is enabled) for which the MFLAG is FALSE (Table 1). The candidate user's distance from all these profile ids is calculated (9.6). If the new user is within a predefined threshold distance of any of the existing profiles (9.7), the user is considered similar to that profile id. A set of predefined distance metrics is available for the distance computation. The distance computation is completed online. The user is assigned to one of the vacant groups corresponding to the profile id (9.8, 9.9). If more than one group is vacant, the one with the least number of participants may be selected; the corresponding experiment is offered (or none if the group is the controlled group) and the matching subsystem flag is set to TRUE (9.10). The number of participants in each group for a particular profile depends on the available budget, the number of batches, the batch size and the cost associated with each experiment offer.
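As a non-limiting illustration, the assignment of an arriving user to a vacant matched group can be sketched as follows. The sketch assumes four groups, one participant per group for each profile id, Euclidean distance, and no budget check; the names `MatchedSet`, `GROUPS` and `assign` are hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

GROUPS = ["control", "exp1", "exp2", "exp3"]  # assumed: one control, three experimental


@dataclass
class MatchedSet:
    """One profile id: the profile recorded at offer time and its matched users."""
    profile: Tuple[float, ...]
    members: Dict[str, str] = field(default_factory=dict)  # group -> user id

    @property
    def mflag(self) -> bool:
        """True once every matched group has a user; False means a vacancy remains."""
        return all(g in self.members for g in GROUPS)


def assign(user_id: str, features: Tuple[float, ...],
           sets: List[MatchedSet], threshold: float) -> str:
    """Assign the user to a vacant group of the closest similar profile,
    creating a new profile id if no existing profile is within the threshold."""
    # Participants per group across all profile ids, before this assignment.
    totals = {g: sum(g in s.members for s in sets) for g in GROUPS}
    candidates = [s for s in sets
                  if not s.mflag and math.dist(features, s.profile) <= threshold]
    if candidates:
        target = min(candidates, key=lambda s: math.dist(features, s.profile))
    else:
        target = MatchedSet(profile=tuple(features))
        sets.append(target)
    vacant = [g for g in GROUPS if g not in target.members]
    group = min(vacant, key=lambda g: totals[g])  # least populated vacant group
    target.members[group] = user_id
    return group
```

When several vacant groups are equally populated, this sketch picks the first one in the order of `GROUPS`; any other tie-breaking rule would fit the description equally well.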

[0086] As an illustration, Table 1 describes the assignment of users to the matched groups. For each set of matched users, MFLAG and Representative User are also shown.

TABLE 1 User Assignment table
Columns: S.No., Controlled group, Experiment Group 1, Experiment Group 2, Experiment Group 3, MFLAG, Representative User
1 A B C False A
2 E F False E
3 T False T
4 X Y U True X
5
Count 3 1 3 2

[0087] If the new user is not similar to any existing profile but has been identified as a good candidate for the experiment (9.11), and if random sampling has been enabled and the budget is available, then a new profile id is created (9.2) for the user and the user is assigned to any of the associated groups for that profile. For example, as shown in Table 1, the new user may be assigned to matched group set 5 and experiment group 1, because the number of users assigned to experiment group 1 is the lowest at this point in time. The first user assigned to a matched group set may be considered the representative user. In case of smart sampling, MFLAG is set to FALSE (9.4).

[0088] For each profile id, the MFLAG is updated whenever a matched user responds to the experiment. There can be a number of experimental groups and only one controlled group for each profile. In such a scenario, MFLAG is set to TRUE when all the matched groups have received the required number of responses from users similar to the profile id. For example, suppose the merchant decides to offer multiple price discounts to customers, say 5%, 10% and 15%. A control group may be created to which no discount is offered. In this experiment, one controlled group is used for three experimental groups. The MFLAG would be set to TRUE when all four groups have a matched user (matched group set 4 in Table 1).

[0089] There are inherent synergies in the implementation of the matched group and smart sampling subsystems. Each profile id in a matched group should be close to an ideal representative profile identified by the smart sampling subsystem. This is because the user is identified as a good candidate only if the user is close to at least one of the ideal representative profiles. The ideal representative profile that the user is close to corresponds to the representative user's profile in the matching subsystem. Hence the matching subsystem will offer experiments to a set of users for each ideal representative profile or profile id.

[0090] There could be several other embodiments for implementing the invention. In one embodiment, the online and offline storage media could be a single entity, especially if the offline sampling subsystem and the offline qualification subsystem are absent.

[0091] In another embodiment, the online qualification subsystem may check the offline selection criteria at run-time, and the offline qualification subsystem does not exist.

[0092] In yet another embodiment, the merchant can copy an existing experiment. The system may force the merchant to change or re-specify certain parameters while retaining others (which can also be modified). For example, the merchant may have to specify a new name for the experiment. Also, the system may ask the merchant to re-specify the START and END date and time.

[0093] In another embodiment, the merchant can modify an existing experiment prior to the start time of the experiment. The system may not allow the merchant to modify some of the parameters of the experiment. For example, the merchant may not be allowed to change the experiment name. Some parameters may be allowed to be changed only in a certain fashion. For example, the system may not allow a reduction in the allocated budget (or a reduction may be allowed, but only to the extent that the budget is unutilized). However, the merchant may be allowed to increase the budget at any time.

[0094] In another embodiment, the active learning algorithm may specify, for each ideal representative profile, the number of users from whom the experiment responses are to be collected.

[0095] In another embodiment, the active learning algorithm may also be modified to select actual users whose features are already known, and the experiment may be offered to these users through e-mail or other communication channels such as direct mail, telephone (call centers), chat interfaces, etc.

[0096] In yet another embodiment, an outside agent sets the order in which the budget subsystem, the online qualification criterion, the smart sampling subsystem and the matching subsystem are executed.

[0097] In another embodiment, an optimization algorithm decides the order in which the budget subsystem, the online qualification criterion, the smart sampling subsystem and the matching subsystem are executed. The relative execution times of each of the subsystems and the relevance of each for the experiment are taken as input, and a learning algorithm (for example, reinforcement learning) learns the optimal schedule.

[0098] In another embodiment, the offline criteria may be specified only for the registered users, as only these users can be identified in an offline mode. For the unregistered users, i.e., users who cannot be associated with any previous visits, only online criteria may be specified. Further, the online criteria can be used in addition to the offline criteria for the registered users.

[0099] In another embodiment, an online version of the offline criteria may also be stored. This version may check for new users who register after the preselection has been done, or for a dynamic change in user features which may enable users to be preselected.

[0100] In another embodiment, analysis of the user response is available online. In another embodiment, a budget prediction algorithm computes the expected number of coupon redemptions given the number of coupon acceptances, based on historic redemption rates for different types of products and product categories, at different points in time, for promotions that ran for different periods of time and allowed different amounts of time for redemption.

[0101] In another embodiment, the merchant may also specify a policy to be used in utilizing the budget.

[0102] (a) The total budget available for the study may include the advertising costs and not just the cost of the change in marketing mix (for example, a price reduction through coupons, which includes the discount given at the time of redemption). (YES/NO)

[0103] (b) The system may offer a fixed change in a marketing mix variable (for example, a fixed discount), which has a fixed cost x, to y customers, such that x*y + advertisement cost of the discounts = budget.

[0104] (c) Since a large number of offers (for example, discount/coupon/price reduction offers) may be ignored, this may be modified to x*z + advertisement cost of the discounts = budget, where z is the number of participants who have accepted the offers. The system keeps track of the number of acceptances at any point in time and stops offering to other users once the number of acceptances reaches the limit z allowed by the budget.
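As a non-limiting illustration, the relation x*z + advertisement cost = budget can be turned into an acceptance limit as sketched below. The function `max_units` and the class `AcceptanceLimiter` are hypothetical names, and rounding the limit down to a whole number of acceptances is an assumption of the sketch.

```python
def max_units(budget: float, cost_per_unit: float, advertisement_cost: float) -> int:
    """Largest whole number n such that cost_per_unit * n + advertisement_cost <= budget."""
    if cost_per_unit <= 0:
        raise ValueError("cost_per_unit must be positive")
    remaining = budget - advertisement_cost
    if remaining < 0:
        raise ValueError("advertisement cost exceeds the budget")
    return int(remaining // cost_per_unit)


class AcceptanceLimiter:
    """Stops further offers once the number of acceptances reaches the limit z."""

    def __init__(self, budget: float, cost_per_acceptance: float,
                 advertisement_cost: float) -> None:
        self.limit = max_units(budget, cost_per_acceptance, advertisement_cost)
        self.accepted = 0

    def may_offer(self) -> bool:
        """True while one more acceptance would still fit within the budget."""
        return self.accepted < self.limit

    def record_acceptance(self) -> None:
        """Count one accepted offer."""
        self.accepted += 1
```

For instance, with a budget of 1000, a fixed cost x of 5 per accepted offer and an advertisement cost of 200, the limit works out to (1000 - 200) / 5 = 160 acceptances.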

[0105] In another embodiment, the merchant defines event-based triggers for deployment, monitoring, intervention and review of experiments. The monitoring is done by defining event listeners (as, for instance, defined in the Java API), and the action part of an event listener could be the deployment of the experiment. The merchant can also define event-based triggers to stop the experiment or modify the parameters of the experiment. Besides event-based control, the merchant can monitor the statistics of the experiment and intervene manually to change the course of the experiment.

[0106] In one instance of the preferred embodiment, the smart sampling subsystem computes the information gain of the online users and selects the users who provide an information gain greater than a certain threshold. The threshold may be dynamically determined to attain the required number of respondents in the given time frame. The key inputs to the threshold-determining algorithm include the expected arrival rate of users, a spatial distribution of users along the user feature space and the expected response rate of the individual users. Dynamic programming, genetic algorithms or reinforcement learning algorithms can be used for learning the threshold for information gain. The information gain computation can be done by maintaining a number of online classifiers; a vote between the classifiers helps select the users with maximum disagreement.
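As a non-limiting illustration, the vote between classifiers can be scored with vote entropy, a common disagreement measure that is used here as an assumed stand-in for the information gain. The classifiers are modeled as plain callables returning a label, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Hashable, Sequence


def vote_entropy(votes: Sequence[Hashable]) -> float:
    """Entropy (in bits) of the label votes; 0 when all classifiers agree."""
    total = len(votes)
    entropy = 0.0
    for count in Counter(votes).values():
        share = count / total
        entropy -= share * math.log2(share)
    return entropy


def select_user(features: Sequence[float],
                committee: Sequence[Callable[[Sequence[float]], Hashable]],
                threshold: float) -> bool:
    """Select the user if the classifiers disagree by more than the threshold."""
    votes = [classifier(features) for classifier in committee]
    return vote_entropy(votes) > threshold
```

A user on whom all classifiers agree scores 0 and is skipped, while a user who splits the committee evenly between two labels scores 1 bit.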

[0107] In another embodiment, the budget subsystem uses the time period remaining for the experiment to be completed.

[0108] In another embodiment, the smart sampling subsystem uses the time period remaining for the experiment to be completed. If the time remaining is greater than a threshold, it may wait for users who are closer to the ideal representative profiles. Otherwise it may increase the allowed distance of users from the ideal representative profiles.
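As a non-limiting illustration, the allowed distance can be widened as the remaining time shrinks. The linear widening schedule and the upper bound `max_threshold` are assumptions of this sketch, and the function name is hypothetical.

```python
def allowed_distance(base_threshold: float, time_remaining: float,
                     time_cutoff: float, max_threshold: float) -> float:
    """Distance threshold as a function of the time left in the experiment.

    Above the cutoff the base threshold is kept, so the subsystem waits for
    close users; below it the threshold grows linearly up to max_threshold.
    """
    if time_cutoff <= 0:
        raise ValueError("time_cutoff must be positive")
    if time_remaining > time_cutoff:
        return base_threshold
    elapsed_fraction = 1.0 - max(time_remaining, 0.0) / time_cutoff
    return base_threshold + (max_threshold - base_threshold) * elapsed_fraction
```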

[0109] In another embodiment, the marketing research experiment terminates when the budget has been exhausted, or the required size of the selected sample has been reached, or the historic or forecasted incremental gain arising from additional sampling has reached a certain threshold limit, or the sample collected so far is expected to reduce the statistical error below a threshold limit.

[0110] In another embodiment, the expected information gain from additional respondents may be translated into an expected gain in prediction accuracy and into the monetary gain equivalent to that improvement in prediction accuracy. The business can compute the value of the additional information and compare it with the cost of collecting the additional information from other respondents. The marketing research experiment may thus terminate once the incremental value declines below the cost of information acquisition.
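As a non-limiting illustration, a termination check that combines budget exhaustion, sample size and the value-versus-cost comparison can be sketched as follows. The monetary value of the next response is assumed to be supplied by some external estimator; the function name, parameter names and reason strings are hypothetical.

```python
from typing import Tuple


def should_terminate(remaining_budget: float, cost_per_respondent: float,
                     respondents: int, required_sample_size: int,
                     expected_value_of_next_response: float) -> Tuple[bool, str]:
    """Return (terminate?, reason) for the marketing research experiment."""
    if remaining_budget < cost_per_respondent:
        return True, "budget exhausted"
    if respondents >= required_sample_size:
        return True, "sample size reached"
    if expected_value_of_next_response < cost_per_respondent:
        return True, "incremental value below acquisition cost"
    return False, "continue"
```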

[0111] In another embodiment, the merchant specifies the result comparison parameters, i.e., whether the result comparison is to be made with historical parameters or with respect to a controlled group. For historical comparison, the merchant specifies the comparable time period.

[0112] It will be apparent to those with ordinary skill in the art that the foregoing is merely illustrative and not intended to be exhaustive or limiting, having been presented by way of example only, and that various modifications can be made within the scope of the above invention. The present invention can be realized in hardware, software or a combination of hardware and software. The modules as described in the invention could either be realized in a centralized manner on one computer system, or could be spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when loaded and executed, controls the computer system such that it carries out the methods described herein.

[0113] Accordingly, this invention is not to be considered limited to the specific examples chosen for purposes of disclosure, but rather to cover all changes and modifications which do not constitute departures from the permissible scope of the present invention. The invention is therefore not limited by the description contained herein or by the drawings, but only by the claims.

What is claimed is:
 1. A method for conducting controlled online marketing research with budgetary control comprising the steps of: a. identifying and specifying the experiment parameters, b. implementing smart sampling to determine whether the current user should be selected for the market research by: (i) determining the ideal representative user profiles so as to maximise the expected information gain, (ii) comparing the current user's profile with existing ideal representative user profiles, wherein a set of users with similar profiles are associated with the same ideal representative profile, c. assigning the selected user to any one of the available experimental or control groups, d. administering the experiment to the selected user if the specified budget is sufficient, e. repeating steps b(ii), (c), and (d) until the specified budget is expended and/or a termination condition is satisfied, and f. analyzing and reporting the results.
 2. The method as claimed in claim 1 wherein the step of identifying and specifying the experiment parameters includes specifying the budget for the experiment, criteria for user selection, the smart sampling parameters, number of experimental and control group(s), and implementation details, such as marketing variables to be tested and user response variables to be measured or surveys to be administered.
 3. The method as claimed in claim 2 wherein the budget is specified in terms of cost, time, or as a limit on the number of participants.
 4. The method as claimed in claim 3 wherein the cost of conducting the experiment differs across the experimental or control groups and the sufficiency of budget is determined by comparing the cost for the assigned group with the available budget.
 5. The method as claimed in claim 3 wherein the cost of conducting the experiment is not deterministically known and the budget available is the estimated budget based on expected response of the users, using techniques including prediction and/or a learning algorithm.
 6. The method as claimed in claim 1 wherein each ideal representative profile selected for comparison must have at least one vacancy in the experimental or control group(s).
 7. The method as claimed in claim 1 wherein the current user is assigned one of the available experimental or control groups such that there is minimal disparity among the relative size of the groups.
 8. The method as claimed in claim 1 wherein the number of users assigned to an experiment or control group(s) for a given ideal representative profile is more than or equal to one.
 9. The method as claimed in claim 1 wherein the smart sampling technique uses any known active learning algorithm or a distance metric selected from a set of metrics.
 10. The method as claimed in claim 2 wherein the smart sampling parameters may include the threshold distance, remaining budget, expected arrival rate of users, a spatial distribution of users in the user feature space and the expected response rate of the individual users.
 11. The method as claimed in claim 1 wherein the ideal representative profiles are re-computed periodically and steps b(ii), (c), and (d) are repeated for the new set of users.
 12. The method as claimed in claim 1, wherein the termination condition includes the size of the selected sample, the expected statistical error, the historic or forecasted incremental gain arising from additional sampling, or the expected incremental value of additional information declining below the cost of acquisition of additional samples.
 13. The method as claimed in claim 10 wherein for the remaining time period being less than a threshold, the threshold distance of users from the ideal representative profiles is increased.
 14. The method as claimed in claim 1 wherein the step of selecting the user further comprises the steps of: ascertaining whether the user is an existing user and tracking user browsing events and determining whether the user qualifies for the online user selection criteria for a new user or a user who is an existing user and has been qualified in the offline process.
 15. A method for conducting controlled online marketing research with budgetary control comprising the steps of: a. identifying and specifying the experiment parameters, b. comparing the current user's profile with existing representative user profiles, wherein a set of users associated with the representative profile have one to one correspondence, c. creating a new representative user profile with current user's profile, for user that does not match with any of the existing representative user profiles or associating the current user with the closest representative user profile, d. assigning the current user to any one of the available experimental or control groups, e. administering the experiment to the selected user if the specified budget is sufficient, f. repeating steps (b), (c), (d) and (e) until the specified budget is expended and/or a termination condition is satisfied, and g. analyzing and reporting the results.
 16. The method asclaimed in claim 15 wherein an algorithm dynamically determines whethera new representative profile would be created for the current user ornot.
 17. The method as claimed in claim 16 wherein the algorithm usesthe available budget, expected information gain by selecting the user,the time remaining for the completion of the experiment, and/or thenumber of vacant representative profiles in the system.
 18. The methodas claimed in claim 15 wherein the step of identifying and specifyingthe experiment parameters includes specifying the budget for theexperiment, criteria for user selection, number of experimental andcontrol group(s), and implementation details, such as marketingvariables to be tested and user response variables to be measured orsurveys to be administered.
 19. The method as claimed in claim 15wherein the cost of conducting the experiment differs across theexperimental or control groups and the sufficiency of budget isdetermined by comparing the cost for the assigned group with theavailable budget.
 20. The method as claimed in claim 15 wherein the cost of conducting the experiment is not deterministically known and the budget available is the estimated budget based on expected response of the users, using techniques including prediction and/or a learning algorithm.
 21. The method as claimed in claim 15 wherein each representative profile selected for comparison has at least one vacancy in the experimental or control group(s).
 22. The method as claimed in claim 15 wherein the current user is assigned to one of the available experimental or control groups such that there is minimal disparity among the relative sizes of the groups.
 23. The method as claimed in claim 15 wherein the number of users assigned to an experimental or control group(s) for a given representative profile can be more than or equal to one.
 24. The method as claimed in claim 15 wherein the step of selecting the user further comprises the steps of: ascertaining whether the user is an existing user, and tracking user's browsing events and determining whether the user qualifies for the online user selection criteria for a new user or a user who is an existing user and has been qualified in the offline process.
 25. A system for conducting controlled online marketing research with budgetary control comprising: a. a parameter specifier that identifies the experiment parameters, b. a representative profile generator that determines the ideal representative user profiles so as to maximise the expected information gain, c. a smart sampler to determine whether the current user should be selected for the market research by: (i) determining the ideal representative user profiles so as to maximise the expected information gain, (ii) comparing the current user's profile with existing ideal representative user profiles, wherein a set of users with similar profiles are associated with the same ideal representative profile, d. an assignor that assigns the current user to any one of the available experimental or control groups, e. administering means that administers the experiment to the selected user if the specified budget is sufficient, and f. an analyzer that analyzes and reports the results.
 26. The system as claimed in claim 25 wherein the parameter specifier is a computer implemented tool for defining the experiment parameters including the budget for the experiment, criteria for user selection, the smart sampling parameters, number of experimental and control group(s), and implementation details, such as marketing variables to be tested and user response variables to be measured or surveys to be administered.
 27. The system as claimed in claim 25 wherein the representative profile generator is a computer implemented tool.
 28. The system as claimed in claim 25 wherein the smart sampler is a computer implemented tool that uses any known active learning algorithm or a distance metric selected from a set of predetermined metrics.
 29. The system as claimed in claim 27 wherein the representative profile generator generates the ideal representative profiles periodically.
 30. The system as claimed in claim 25 wherein the smart sampler further comprises: means for ascertaining whether the user is an existing user, and means for tracking user's browsing events and determining whether the user qualifies for the online user selection criteria for a new user or a user who is an existing user and has been qualified in the offline process.
 31. A system for conducting controlled online marketing research with budgetary control comprising: a. parameter specifier for specifying the experiment parameters, b. user selector for comparing the current user's profile with existing representative user profiles, wherein a set of users associated with the representative profile have one to one correspondence, c. representative profile generator for creating a new representative user profile with the current user's profile, for users that do not match with any of the existing representative user profiles, or associating the current user with the closest representative user profile, d. assignor for assigning the current user to any one of the available experimental or control groups, e. administering means for administering the experiment to the selected user if the specified budget is sufficient, and f. analyzer for analyzing and reporting the results.
 32. The system as claimed in claim 31 wherein the representative profile generator uses an algorithm that dynamically determines whether a new representative profile would be created for the current user or not.
 33. The system as claimed in claim 31 wherein the parameter specifier is a computer implemented tool for specifying the budget for the experiment, criteria for user selection, number of experimental and control group(s), and implementation details, such as marketing variables to be tested and user response variables to be measured or surveys to be administered.
 34. The system as claimed in claim 31 wherein the user selector is a computer implemented tool further comprising: means for ascertaining whether the user is an existing user, and means for tracking user's browsing events and determining whether the user qualifies for the online user selection criteria for a new user or a user who is an existing user and has been qualified in the offline process.
 35. The system as claimed in claim 25 or 31 wherein the said tools wholly or partially reside on a computing system comprising: at least one system bus, at least one communications unit connected to the system bus, a memory unit including a set of instructions connected to the system bus, and at least one control unit executing the instructions in the memory for the functioning of the tools.
 36. The system as claimed in claim 35 further connected to other similar systems and database systems that may contain tools to complement and supplement the already existing tools and libraries present.
 37. The system as claimed in claim 36 wherein the said systems are interconnected through any suitable computer network including Ethernet, Internet, LAN, WAN, and MAN using any desired network topology including ring, bus and star.
 38. A computer program product residing on computer readable media containing computer readable program code for causing a computer to conduct controlled online marketing research with budgetary control comprising: a. computer readable program code means configured for identifying and specifying the experiment parameters, b. computer readable program code configured for implementing smart sampling for determining whether the current user should be selected for the market research by: (i) determining the ideal representative user profiles so as to maximise the expected information gain, (ii) comparing the current user's profile with existing ideal representative user profiles, wherein a set of users with similar profiles are associated with the same ideal representative profile, c. computer readable program code means configured for assigning the selected user to any one of the available experimental or control groups, d. computer readable program code means configured for administering the experiment to the selected user if the specified budget is sufficient, e. computer readable program code means configured for repeating steps (b)(ii), (c), and (d) until the specified budget is expended and/or the required data is collected, and f. computer readable program code means configured for analyzing and reporting the results.
 39. The computer program product as claimed in claim 38 wherein the computer readable program code means configured for identifying and specifying the experiment parameters includes computer readable program code means configured for specifying the budget for the experiment, criteria for user selection, the smart sampling parameters, number of experimental and control group(s), and implementation details, such as marketing variables to be tested and user response variables to be measured or surveys to be administered.
 40. The computer program product as claimed in claim 39 including computer readable program code means configured to enable specifying the budget in terms of cost, time, or as a limit on the number of participants.
 41. The computer program product as claimed in claim 40 including computer readable program code means configured for determining sufficiency of the budget when the cost of conducting the experiment differs across the experimental or control groups by comparing the cost for the assigned group with the available budget.
 42. The computer program product as claimed in claim 40 including computer readable program code means configured for determining budget sufficiency when the cost of conducting the experiment is not deterministically known and the budget available is the estimated budget based on expected response of the users, using techniques including prediction and/or a learning algorithm.
 43. The computer program product as claimed in claim 38 wherein each ideal representative profile selected for comparison has at least one vacancy in the experimental or control group(s).
 44. The computer program product as claimed in claim 38 including computer readable program code means configured for assigning the current user to one of the available experimental or control groups such that there is minimal disparity among the relative sizes of the groups.
 45. The computer program product as claimed in claim 38 including computer readable program code means configured for assigning more than one user to an experimental or control group(s) for a given ideal representative profile.
 46. The computer program product as claimed in claim 38 including computer readable program code means configured such that the smart sampling uses any known active learning algorithm or a distance metric selected from a set of predetermined metrics.
 47. The computer program product as claimed in claim 38 including computer readable program code means configured for setting the smart sampling parameters, including threshold distance, expected arrival rate of users, a spatial distribution of users along the user feature space and the expected response rate of the individual users as the desired parameters.
 48. The computer program product as claimed in claim 38 including computer readable program code means configured for recomputing the ideal representative profiles periodically and repeating steps (b)(ii), (c), and (d) for the new set of users.
 49. The computer program product as claimed in claim 38, including computer readable program code means configured for dynamically adjusting the smart sampling based on the time period remaining for the experiment to be completed.
 50. The computer program product as claimed in claim 38 wherein the computer readable program code means configured for selecting the user further comprises: computer readable program code means configured for ascertaining whether the user is an existing user, and computer readable program code means configured for tracking user browsing events and determining whether the user qualifies for the online user selection criteria for a new user or a user who is an existing user and has been qualified in the offline process.
 51. A computer program product for conducting controlled online marketing research with budgetary control comprising: a. computer readable program code means configured for identifying and specifying the experiment parameters, b. computer readable program code means configured for comparing the current user's profile with existing representative user profiles, wherein a representative profile is associated with a set of users whose profiles have one to one correspondence, c. computer readable program code means configured for creating a new representative user profile with the current user's profile, for a user that does not match with any of the existing representative user profiles, or associating the current user with the closest representative user profile, d. computer readable program code means configured for assigning the current user to any one of the available experimental or control groups, e. computer readable program code means configured for administering the experiment to the selected user if the specified budget is sufficient, f. computer readable program code means configured for repeating steps (b), (c), (d) and (e) until the specified budget is expended and/or the required data is collected, and g. computer readable program code means configured for analyzing and reporting the results.
 52. The computer program product as claimed in claim 51 including computer readable program code means configured for dynamically determining whether or not a new representative profile is to be created for the current user.
 53. The computer program product as claimed in claim 52 including computer readable program code means configured for making said determination using the available budget, expected information gain by selecting the user, the time remaining for the completion of the experiment, and/or the number of vacant representative profiles in the system.
 54. The computer program product as claimed in claim 51 including computer readable program code means configured for identifying and specifying the experiment parameters including specifying the budget for the experiment, criteria for user selection, number of experimental and control group(s), and implementation details, such as marketing variables to be tested and user response variables to be measured or surveys to be administered.
 55. The computer program product as claimed in claim 51 including computer readable program code means configured for determining sufficiency of budget when the cost of conducting the experiment differs across the experimental or control groups by comparing the cost for the assigned group with the available budget.
 56. The computer program product as claimed in claim 55 including computer readable program code means configured for determining budget sufficiency when the cost of conducting the experiment is not deterministically known and the budget available is the estimated budget based on expected response of the users, using techniques including prediction and/or a learning algorithm.
 57. The computer program product as claimed in claim 51 including computer readable program code means configured for ensuring that each representative profile selected for comparison has at least one vacancy in the experimental or control group(s).
 58. The computer program product as claimed in claim 51 including computer readable program code means configured for assigning the current user to one of the available experimental or control groups such that there is minimal disparity among the relative sizes of the groups.
 59. The computer program product as claimed in claim 51 including computer readable program code means configured for assigning more than one user to an experimental or control group(s) for a given representative profile.
 60. The computer program product as claimed in claim 51 including computer readable program code means configured for: ascertaining whether the user is an existing user, and tracking user's browsing events and determining whether the user qualifies for the online user selection criteria for a new user or a user who is an existing user and has been qualified in the offline process.
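
By way of non-limiting illustration only, the loop recited in steps (b) to (f) of claims 15 and 51 might be sketched in Python as follows. The sketch is not part of the claims and is not the claimed implementation. Everything named in it is an assumption made for illustration: the Euclidean distance metric, the fixed `threshold`, the per-group costs, the `slots_per_group` vacancy limit, and the names `RepresentativeProfile` and `run_experiment` are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RepresentativeProfile:
    """Hypothetical container: a feature vector plus per-group user counts."""
    features: tuple
    group_counts: dict = field(default_factory=dict)


def run_experiment(users, groups, group_cost, budget, threshold, slots_per_group=1):
    """Assign arriving users to groups until the budget cannot cover any group.

    Assumptions (not taken from the specification): Euclidean distance,
    a fixed matching threshold, and a known, deterministic cost per group.
    Returns (assignments, remaining_budget, profiles).
    """
    profiles, assignments = [], []
    for user_id, features in users:
        # Step (f): stop once even the cheapest group is unaffordable.
        if budget < min(group_cost.values()):
            break
        # Step (b): compare only against profiles that still have a vacancy.
        open_profiles = [
            p for p in profiles
            if any(p.group_counts[g] < slots_per_group for g in groups)
        ]
        closest = min(
            open_profiles,
            key=lambda p: math.dist(p.features, features),
            default=None,
        )
        # Step (c): no profile is close enough, so propose a new one.
        is_new = closest is None or math.dist(closest.features, features) > threshold
        profile = (
            RepresentativeProfile(features, {g: 0 for g in groups})
            if is_new else closest
        )
        # Step (d): choose the vacant group with the fewest users for this
        # profile, which keeps group sizes balanced.
        vacant = [g for g in groups if profile.group_counts[g] < slots_per_group]
        group = min(vacant, key=lambda g: profile.group_counts[g])
        # Step (e): administer only if the budget covers the assigned group.
        if group_cost[group] > budget:
            continue
        budget -= group_cost[group]
        profile.group_counts[group] += 1
        if is_new:
            profiles.append(profile)
        assignments.append((user_id, group))
    return assignments, budget, profiles
```

In this sketch a new representative profile is stored only once its first user has actually been administered the experiment, so an unaffordable assignment leaves no empty profile behind. A learning-based cost estimate, as in claims 20 and 56, would replace the fixed `group_cost` lookup.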